Semi-Supervised Methods for Improving Keyword Search of Unseen Terms

Authors

  • Scott Novotney
  • Ivan Bulyko
  • Richard M. Schwartz
  • Sanjeev Khudanpur
  • Owen Kimball
Abstract

We present a semi-supervised language modeling technique to improve search performance on terms that have no training data. Probabilities estimated from automatic transcripts of a large corpus of in-domain audio are added to an existing language model (LM). Requiring neither development data nor external resources, our method achieves 70% of the gain possible from manual transcription of the same audio. This is in sharp contrast to the modest gains of previous semi-supervised LM experiments. We compare the value of additional resources (labor or data) to semi-supervised learning. If human effort is available, we describe a transcription regime to efficiently close the remaining performance gap.
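The abstract does not say how the probabilities from automatic transcripts are combined with the existing LM. As a minimal sketch only, the idea can be illustrated with linear interpolation of two unigram models. The interpolation scheme, the weight of 0.3, the toy sentences, and the function names `unigram_probs` and `interpolate` are all assumptions made for illustration and are not taken from the paper.

```python
from collections import Counter


def unigram_probs(sentences):
    """Maximum-likelihood unigram probabilities from whitespace-tokenized text."""
    counts = Counter(word for sent in sentences for word in sent.split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def interpolate(base_lm, auto_lm, weight=0.5):
    """Linear interpolation: (1 - weight) * base_lm + weight * auto_lm."""
    vocab = set(base_lm) | set(auto_lm)
    return {
        word: (1 - weight) * base_lm.get(word, 0.0) + weight * auto_lm.get(word, 0.0)
        for word in vocab
    }


# Hypothetical toy data: manual transcripts train the existing LM,
# automatic (ASR) transcripts supply counts for terms it has never seen.
manual = ["the meeting starts at noon", "the report is late"]
automatic = ["the earthquake hit at noon", "the earthquake report is late"]

base_lm = unigram_probs(manual)
auto_lm = unigram_probs(automatic)
mixed_lm = interpolate(base_lm, auto_lm, weight=0.3)  # weight is an arbitrary assumption
```

In this toy setup, "earthquake" has zero probability under `base_lm` but a nonzero probability under `mixed_lm`, which is the kind of effect that makes a previously unseen term searchable. A real system would use higher-order n-grams and a principled way of choosing the weight.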


Similar resources

Improving semi-supervised deep neural network for keyword search in low resource languages

In this work, we investigate how to improve semi-supervised DNN for low resource languages where the initial systems may have a high error rate. We propose using semi-supervised MLP features for DNN training, and we also explore using confidence to improve semi-supervised cross entropy and sequence training. The work conducted in this paper was evaluated under the IARPA Babel program for the keyw...


An Effective Path-aware Approach for Keyword Search over Data Graphs

Keyword search is known as a user-friendly alternative to structured query languages for retrieving information from graph-structured data. Efficiently retrieving relevant answers to a keyword query and effectively ranking these answers according to their relevance are the two main challenges in keyword search over graph-structured data. In this paper, a novel scoring function is proposed, w...


Semi-supervised Induction with Basis Functions

Considerable progress was recently made on semi-supervised learning, which differs from traditional supervised learning by additionally exploring the information in the unlabeled examples. However, a disadvantage of many existing methods is that they do not generalize to unseen inputs. This paper suggests a space of basis functions to perform semi-supervised inductive learning. As a nice pr...


Zero-Shot Learning via Class-Conditioned Deep Generative Models

We present a deep generative model for Zero-Shot Learning (ZSL). Unlike most existing methods for this problem, which represent each class as a point (via a semantic embedding), we represent each seen/unseen class using a class-specific latent-space distribution, conditioned on class attributes. We use these latent-space distributions as a prior for a supervised variational autoencoder (VAE), whi...


FDVQ based keyword spotter which incorporates a semi-supervised learning for primary processing

In this paper, we present a novel hybrid keyword spotting system that combines supervised and semi-supervised competitive learning algorithms. The first stage is an S-SOM (Semi-supervised Self-Organizing Map) module which is specifically designed for discrimination between keywords (KWs) and non-keywords (NKWs). The second stage is an FDVQ (Fuzzy Dynamic Vector Quantization) module which consists of...



Publication date: 2012